Computerised objective measurement of strain in voiced speech

Farideh Jalalinajafabadi; Chaitanya Gadepalli; Mohsen Ghasempour; Mikel Lujan; Barry Cheetham; Jarrod Homer

doi:10.1109/EMBC.2015.7319659

Annu Int Conf IEEE Eng Med Biol Soc. 2015 Aug:2015:5589-92. doi: 10.1109/EMBC.2015.7319659.

PMID: 26737559

Abstract

Healthcare professionals need to assess voice quality in patients suffering from voice problems. Speech and language therapists (SLTs) quantify voice problems using a well-known subjective assessment approach called GRBAS. GRBAS is an acronym for a five-dimensional scale of measurements of voice properties which were originally recommended by the Japanese Society of Logopedics and Phoniatrics and the European Research for clinical and research use. The properties are 'Grade', 'Roughness', 'Breathiness', 'Asthenia' and 'Strain'. Because it requires the services of trained SLTs, this subjective assessment makes the GRBAS measurement expensive to administer. In this research, computerised objective measurement of 'Strain' in voice using two regression prediction models is compared with measurements produced by SLTs according to the GRBAS scale. These regression models are K Nearest Neighbor Regression (KNNR) and Multiple Linear Regression (MLR). These new approaches to predicting Strain differ from previous approaches in the literature in their feature subsets, data sets, and prediction models. The best-performing feature subset for predicting Strain objectively was selected from among several candidate subsets. When compared with the mean of five SLTs' scores over 102 samples, the computerised measurement was found to have a Normalized Root Mean Square Error (NRMSE), averaged over 20 trials, lower than that of each individual SLT. We achieved an NRMSE of 14.6% and 15.1% for MLR and KNNR respectively when the best feature subsets were used for predicting Strain objectively.
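As a rough illustration of the evaluation described above, the following Python sketch shows one way KNN regression and an NRMSE score could be computed. It is not the authors' implementation. The function names `nrmse` and `knn_regress`, the toy feature values, and the choice to normalise RMSE by the 0 to 3 GRBAS score range are all assumptions made for illustration, since the abstract does not state the normaliser or the features used.

```python
import math

def nrmse(predicted, reference, scale_range=3.0):
    """Root mean square error divided by the score range, as a percentage.

    Assumption: normalisation by the 0-3 GRBAS score range; the abstract
    does not say which normaliser the authors used.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)
    return 100.0 * math.sqrt(mse) / scale_range

def knn_regress(train_x, train_y, query, k=3):
    """Predict a score as the mean target of the k nearest training samples
    (Euclidean distance in feature space)."""
    by_distance = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical toy data: two made-up acoustic features per recording,
# with targets standing in for the mean of several raters' Strain scores.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
train_y = [0.2, 0.4, 2.6, 2.8]
test_x = [(0.15, 0.15), (0.85, 0.85)]
test_y = [0.5, 2.5]

preds = [knn_regress(train_x, train_y, q, k=2) for q in test_x]
score = nrmse(preds, test_y)
print(preds, score)
```

In a setup like the one the abstract describes, the same `nrmse` function could be applied to each individual rater's scores against the mean rater score, giving a per-rater baseline to compare with the model's error.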

MeSH terms

Humans
Speech*
Voice
Voice Disorders
Voice Quality